How to Compile a Bilingual Collocational Lexicon Automatically
Abstract
These sentences are extracted from a corpus of the proceedings of the Canadian Parliament, also called the Hansards corpus. As required by law, the Hansards corpus contains both the English and the French version of each sentence. The corpus consists of a number of pairs of files, one written in English and the other in French. We used a version of the Hansards in which the sentences have been aligned with their translations as described in [Church91]. Sentence (1f) is thus the French translation of Sentence (1e).
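As a rough illustration of the file-pair layout described above, the following Python sketch pairs each English sentence with its French translation. It assumes one sentence per line and that line i of the English file is aligned with line i of the French file; the actual storage format of the aligned Hansards is not stated here, and the function name and sample sentences are invented for the example.

```python
from typing import Iterable, List, Tuple


def pair_aligned_sentences(
    english_lines: Iterable[str], french_lines: Iterable[str]
) -> List[Tuple[str, str]]:
    """Return (english, french) sentence pairs.

    Assumption: both inputs hold one sentence per line, and line i of
    one input is the translation of line i of the other.
    """
    english = [line.strip() for line in english_lines]
    french = [line.strip() for line in french_lines]
    if len(english) != len(french):
        # Unequal lengths mean the one-to-one alignment assumption is broken.
        raise ValueError("inputs are not aligned line by line")
    return list(zip(english, french))


# Invented sample sentences, not taken from the Hansards.
pairs = pair_aligned_sentences(
    ["The House met at 2 p.m.\n", "I thank the minister.\n"],
    ["La Chambre se réunit à 14 heures.\n", "Je remercie le ministre.\n"],
)
```

Each element of `pairs` then plays the role of one aligned sentence pair, with the English sentence first and its French counterpart second.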
Similar resources
Modeling bilingual word associations as connected monolingual networks
Word associations are a common tool in research on the mental lexicon. Studies report that bilinguals produce different word associations in their non-native language than monolinguals, and propose at least three mechanisms responsible for this difference: bilinguals may rely on their native associations (through translation), on collocational patterns, and on the phonological similarity betwee...
Lexical Functions And Machine Translation
This paper discusses the lexicographical concept of lexical functions (Mel'čuk and Zolkovsky, 1984) and their potential exploitation in the development of a machine translation lexicon designed to handle collocations. We show how lexical functions can be thought to reflect cross-linguistic meaning concepts for collocational structures and their translational equivalents, and therefore suggest t...
Automatically Extracting and Representing Collocations for Language Generation
Collocational knowledge is necessary for language generation. The problem is that collocations come in a large variety of forms. They can involve two, three or more words, these words can be of different syntactic categories and they can be involved in more or less rigid ways. This leads to two main difficulties: collocational knowledge has to be acquired and it must be represented flexibly so ...
Compiling Bilingual Lexicon Entries From a Non-Parallel English-Chinese Corpus
We propose a novel context heterogeneity similarity measure between words and their translations in helping to compile bilingual lexicon entries from a non-parallel English-Chinese corpus. Current algorithms for bilingual lexicon compilation rely on occurrence frequencies, length or positional statistics derived from parallel texts. There is little correlation between such statistics of a word ...
Harnessing the lawless: using comparable corpora to find translation equivalents
Bilingual dictionaries provide basic translation equivalents for a headword and typically limit the set of equivalents to words of the same part of speech as the headword. However, words taken in their contexts can be translated in many more ways. At the same time, equivalents listed in dictionaries are not adequate in many contexts, because of the contextual and collocational sensitivity of ta...
Publication date: 2002